Fife
Optimizing Kernel Discrepancies via Subset Selection
Chen, Deyao, Clément, François, Doerr, Carola, Kirk, Nathan
Kernel discrepancies are a powerful tool for analyzing worst-case errors in quasi-Monte Carlo (QMC) methods. Building on recent advances in optimizing such discrepancy measures, we extend the subset selection problem to the setting of kernel discrepancies, selecting an m-element subset from a large population of size $n \gg m$. We introduce a novel subset selection algorithm applicable to general kernel discrepancies to efficiently generate low-discrepancy samples from both the uniform distribution on the unit hypercube, the traditional setting of classical QMC, and from more general distributions $F$ with known density functions by employing the kernel Stein discrepancy. We also explore the relationship between the classical $L_2$ star discrepancy and its $L_\infty$ counterpart.
Rogue planet is gobbling up 6.6 billion tons of dust per second
Science Space Deep Space Exoplanets Rogue planet is gobbling up 6.6 billion tons of dust per second The cosmic oddities experience their own growth spurts. Breakthroughs, discoveries, and DIY tips sent every weekday. About 620 light-years from Earth, a gigantic rogue proto-planet is currently devouring 6.6 billion tons of dust and gas per second. Based on recent observations, the relatively new resident of the Chamaeleon constellation isn't stopping anytime soon--and the situation may get even more intense. But according to astronomers, that may be pretty standard behavior for these cosmic objects.
A TISS: Autoregressive Transformers for Indoor Scene Synthesis Despoina Paschalidou
We argue that this formulation is more natural, as it makes A TISS generally useful beyond fully automatic room layout synthesis. For example, the same trained model can be used in interactive applications for general scene completion, partial room rearrangement with any objects specified by the user, as well as object suggestions for any partial room.
Dynamic Similarity Graph Construction with Kernel Density Estimation
Laenen, Steinar, Macgregor, Peter, Sun, He
In the kernel density estimation (KDE) problem, we are given a set $X$ of data points in $\mathbb{R}^d$, a kernel function $k: \mathbb{R}^d \times \mathbb{R}^d \rightarrow \mathbb{R}$, and a query point $\mathbf{q} \in \mathbb{R}^d$, and the objective is to quickly output an estimate of $\sum_{\mathbf{x} \in X} k(\mathbf{q}, \mathbf{x})$. In this paper, we consider $\textsf{KDE}$ in the dynamic setting, and introduce a data structure that efficiently maintains the estimates for a set of query points as data points are added to $X$ over time. Based on this, we design a dynamic data structure that maintains a sparse approximation of the fully connected similarity graph on $X$, and develop a fast dynamic spectral clustering algorithm. We further evaluate the effectiveness of our algorithms on both synthetic and real-world datasets.
Primer C-VAE: An interpretable deep learning primer design method to detect emerging virus variants
Wang, Hanyu, Tsinda, Emmanuel K., Dunn, Anthony J., Chikweto, Francis, Zemkoho, Alain B.
Motivation: PCR is more economical and quicker than Next Generation Sequencing for detecting target organisms, with primer design being a critical step. In epidemiology with rapidly mutating viruses, designing effective primers is challenging. Traditional methods require substantial manual intervention and struggle to ensure effective primer design across different strains. For organisms with large, similar genomes like Escherichia coli and Shigella flexneri, differentiating between species is also difficult but crucial. Results: We developed Primer C-VAE, a model based on a Variational Auto-Encoder framework with Convolutional Neural Networks to identify variants and generate specific primers. Using SARS-CoV-2, our model classified variants (alpha, beta, gamma, delta, omicron) with 98% accuracy and generated variant-specific primers. These primers appeared with >95% frequency in target variants and <5% in others, showing good performance in in-silico PCR tests. For Alpha, Delta, and Omicron, our primer pairs produced fragments <200 bp, suitable for qPCR detection. The model also generated effective primers for organisms with longer gene sequences like E. coli and S. flexneri. Conclusion: Primer C-VAE is an interpretable deep learning approach for developing specific primer pairs for target organisms. This flexible, semi-automated and reliable tool works regardless of sequence completeness and length, allowing for qPCR applications and can be applied to organisms with large and highly similar genomes.
On the Importance of Reward Design in Reinforcement Learning-based Dynamic Algorithm Configuration: A Case Study on OneMax with (1+($\lambda$,$\lambda$))-GA
Nguyen, Tai, Le, Phong, Biedenkapp, André, Doerr, Carola, Dang, Nguyen
Dynamic Algorithm Configuration (DAC) has garnered significant attention in recent years, particularly in the prevalence of machine learning and deep learning algorithms. Numerous studies have leveraged the robustness of decision-making in Reinforcement Learning (RL) to address the optimization challenges associated with algorithm configuration. However, making an RL agent work properly is a non-trivial task, especially in reward design, which necessitates a substantial amount of handcrafted knowledge based on domain expertise. In this work, we study the importance of reward design in the context of DAC via a case study on controlling the population size of the $(1+(\lambda,\lambda))$-GA optimizing OneMax. We observed that a poorly designed reward can hinder the RL agent's ability to learn an optimal policy because of a lack of exploration, leading to both scalability and learning divergence issues. To address those challenges, we propose the application of a reward shaping mechanism to facilitate enhanced exploration of the environment by the RL agent. Our work not only demonstrates the ability of RL in dynamically configuring the $(1+(\lambda,\lambda))$-GA, but also confirms the advantages of reward shaping in the scalability of RL agents across various sizes of OneMax problems.
A Survey of QUD Models for Discourse Processing
Question Under Discussion (QUD), which is originally a linguistic analytic framework, gains increasing attention in the community of natural language processing over the years. Various models have been proposed for implementing QUD for discourse processing. This survey summarizes these models, with a focus on application to written texts, and examines studies that explore the relationship between QUD and mainstream discourse frameworks, including RST, PDTB and SDRT. Some questions that may require further study are suggested.
WatchGuardian: Enabling User-Defined Personalized Just-in-Time Intervention on Smartwatch
Lei, Ying, Cao, Yancheng, Wang, Will, Dong, Yuanzhe, Yin, Changchang, Cao, Weidan, Zhang, Ping, Yang, Jingzhen, Yao, Bingsheng, Peng, Yifan, Weng, Chunhua, Auerbach, Randy, Mamykina, Lena, Wang, Dakuo, Wang, Yuntao, Xu, Xuhai
While just-in-time interventions (JITIs) have effectively targeted common health behaviors, individuals often have unique needs to intervene in personal undesirable actions that can negatively affect physical, mental, and social well-being. We present WatchGuardian, a smartwatch-based JITI system that empowers users to define custom interventions for these personal actions with a small number of samples. For the model to detect new actions based on limited new data samples, we developed a few-shot learning pipeline that finetuned a pre-trained inertial measurement unit (IMU) model on public hand-gesture datasets. We then designed a data augmentation and synthesis process to train additional classification layers for customization. Our offline evaluation with 26 participants showed that with three, five, and ten examples, our approach achieved an average accuracy of 76.8%, 84.7%, and 87.7%, and an F1 score of 74.8%, 84.2%, and 87.2% We then conducted a four-hour intervention study to compare WatchGuardian against a rule-based intervention. Our results demonstrated that our system led to a significant reduction by 64.0 +- 22.6% in undesirable actions, substantially outperforming the baseline by 29.0%. Our findings underscore the effectiveness of a customizable, AI-driven JITI system for individuals in need of behavioral intervention in personal undesirable actions. We envision that our work can inspire broader applications of user-defined personalized intervention with advanced AI solutions.
Text-to-Image Generation for Vocabulary Learning Using the Keyword Method
Attygalle, Nuwan T., Kljun, Matjaž, Quigley, Aaron, Pucihar, Klen čOpič, Grubert, Jens, Biener, Verena, Leiva, Luis A., Yoneyama, Juri, Toniolo, Alice, Miguel, Angela, Kato, Hirokazu, Weerasinghe, Maheshya
The 'keyword method' is an effective technique for learning vocabulary of a foreign language. It involves creating a memorable visual link between what a word means and what its pronunciation in a foreign language sounds like in the learner's native language. However, these memorable visual links remain implicit in the people's mind and are not easy to remember for a large set of words. To enhance the memorisation and recall of the vocabulary, we developed an application that combines the keyword method with text-to-image generators to externalise the memorable visual links into visuals. These visuals represent additional stimuli during the memorisation process. To explore the effectiveness of this approach we first run a pilot study to investigate how difficult it is to externalise the descriptions of mental visualisations of memorable links, by asking participants to write them down. We used these descriptions as prompts for text-to-image generator (DALL-E2) to convert them into images and asked participants to select their favourites. Next, we compared different text-to-image generators (DALL-E2, Midjourney, Stable and Latent Diffusion) to evaluate the perceived quality of the generated images by each. Despite heterogeneous results, participants mostly preferred images generated by DALL-E2, which was used also for the final study. In this study, we investigated whether providing such images enhances the retention of vocabulary being learned, compared to the keyword method only. Our results indicate that people did not encounter difficulties describing their visualisations of memorable links and that providing corresponding images significantly improves memory retention.